Chapter 6 Convolutional Neural Network (ConvNet)
Architecture of ConvNet
Representative hand-crafted methods for feature extraction include SIFT, HOG, Textons, Spin Image, RIFT, and GLOH.
The feature extraction network consists of stacked pairs of convolutional and pooling layers. The operations of the convolution and pooling layers act conceptually on a two-dimensional plane. This is one of the differences between ConvNet and other neural networks.
Convolution Layer
weight --> 2D convolution kernel
weighted sum --> 2D convolution
Vertical and horizontal edge-detection filters
Sobel filter
hand-designed filters --> filters learned by deep learning
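Before turning to learned filters, the idea of filtering itself can be sketched in a few lines. The following is a minimal, illustrative Python implementation of 2D correlation ("filtering") with a hand-designed Sobel vertical-edge kernel; the image and kernel are plain nested lists, and the tiny test image is a made-up example.

```python
# Minimal sketch of 2D correlation ("valid" mode) with a Sobel kernel.
def correlate2d(img, ker):
    kh, kw = len(ker), len(ker[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1  # 'valid' output size
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(img[i + u][j + v] * ker[u][v]
                            for u in range(kh) for v in range(kw))
    return out

# Sobel filter for vertical edges (responds to horizontal intensity change):
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# A tiny image with a vertical edge (dark left half, bright right half):
img = [[0, 0, 10, 10]] * 4
edges = correlate2d(img, sobel_x)  # strong response along the edge
```

The same `correlate2d` routine applied with a learned kernel instead of `sobel_x` is exactly what a convolution layer computes for one output channel.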
image = imread('cameraman.tif');
w1 = [0 -1 0; -1 4 -1; 0 -1 0]; % Laplacian (edge-enhancing) filter
w2 = ones(3,3)/9;               % 3x3 averaging (smoothing) filter
filteredImage1 = imfilter(image, w1, 'same', 'corr');
filteredImage2 = imfilter(image, w2, 'same', 'corr');
imshow([image filteredImage1 filteredImage2])
Padding in convolution
Strided convolution
Convolutions on RGB images
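Padding and stride determine the spatial output size: for input size n, filter size f, padding p, and stride s, the output is floor((n + 2p − f)/s) + 1 per dimension. For an RGB image the filter has the same number of channels as the input, and the products over all channels are summed into a single output channel. A small sketch of the size formula (the specific sizes below are illustrative):

```python
# Spatial output size of a convolution: floor((n + 2p - f) / s) + 1.
def conv_output_size(n, f, p=0, s=1):
    return (n + 2 * p - f) // s + 1

# 28x28 input, 9x9 filter, no padding, stride 1 -> 20x20
# (matching the MNIST example later in this chapter):
print(conv_output_size(28, 9, p=0, s=1))  # 20
# "same" padding for a 3x3 filter uses p = 1:
print(conv_output_size(28, 3, p=1, s=1))  # 28
# Strided convolution shrinks the output faster:
print(conv_output_size(28, 3, p=0, s=2))  # 13
```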
Example of a layer
fully connected network:
backpropagation:
convolutional network:
backpropagation:
Assume h is a 5x5 filter; then we define

y(i, j) = sum_{u=1..5} sum_{v=1..5} h(u, v) x(i+u-1, j+v-1).

Note that this is correlation, not classical convolution (the kernel is not flipped). Convolution exploits sparse interactions (also called sparse connectivity or sparse weights) and parameter sharing.
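The distinction is only a kernel flip: classical convolution equals correlation with the kernel reversed in both dimensions, so a network that learns its filters can use either operation interchangeably. A minimal sketch of the flip:

```python
# Classical convolution = correlation with the kernel flipped in both
# dimensions; flipping twice recovers the original kernel.
def flip2d(k):
    return [row[::-1] for row in k[::-1]]

k = [[1, 2],
     [3, 4]]
assert flip2d(k) == [[4, 3],
                     [2, 1]]
assert flip2d(flip2d(k)) == k
```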
Pooling Layer
The pooling layer compensates for eccentric and tilted objects to some extent, providing a degree of invariance to small shifts and distortions. For example, the pooling layer can improve the recognition of a cat that is off-center in the input image. In addition, because pooling reduces the image size, it relieves the computational load and helps prevent overfitting.
Backpropagation of pooling layer
mean pooling
max pooling
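The backward pass differs between the two variants: max pooling routes each upstream gradient entirely to the position that attained the maximum, while mean pooling spreads it equally over the window. An illustrative sketch of 2x2 max pooling with its backward pass (the input values are made up):

```python
# 2x2 max pooling: forward pass records where each max came from.
def maxpool2x2(x):
    h, w = len(x) // 2, len(x[0]) // 2
    out, argmax = [[0] * w for _ in range(h)], {}
    for i in range(h):
        for j in range(w):
            window = [(x[2*i+u][2*j+v], (2*i+u, 2*j+v))
                      for u in range(2) for v in range(2)]
            out[i][j], argmax[(i, j)] = max(window)
    return out, argmax

# Backward pass: the gradient flows only through the max entry of each
# window (mean pooling would instead assign grad[i][j]/4 to all four).
def maxpool2x2_backward(grad, argmax, shape):
    dx = [[0] * shape[1] for _ in range(shape[0])]
    for (i, j), (r, c) in argmax.items():
        dx[r][c] = grad[i][j]
    return dx

x = [[1, 3, 2, 0],
     [4, 2, 1, 1],
     [0, 0, 5, 6],
     [0, 0, 7, 8]]
y, am = maxpool2x2(x)   # y == [[4, 2], [0, 8]]
dx = maxpool2x2_backward([[1, 1], [1, 1]], am, (4, 4))
```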
Example: MNIST
The training data come from the MNIST database, which contains 70,000 images of handwritten digits. In general, 60,000 images are used for training and the remaining 10,000 for testing. Each digit is a 28-by-28-pixel grayscale image.




Images = loadMNISTImages('t10k-images.idx3-ubyte'); % the 10,000-image MNIST test file
Images = reshape(Images, 28, 28, []);
Labels = loadMNISTLabels('t10k-labels.idx1-ubyte');
Labels(Labels == 0) = 10; % relabel digit 0 as class 10 for MATLAB indexing
W1 = 1e-2*randn([9 9 20]); % 20 convolution kernels, 9x9 each
W5 = (2*rand(100, 2000) - 1) * sqrt(6) / sqrt(100 + 2000); % Xavier initialization
Wo = (2*rand( 10, 100) - 1) * sqrt(6) / sqrt( 10 + 100);   % Xavier initialization
X = Images(:, :, 1:8000); % training images
D = Labels(1:8000);       % training labels
for epoch = 1:10
  [W1, W5, Wo] = MnistConv(W1, W5, Wo, X, D); % one pass over the training set
end
epoch = 1
epoch = 2
...
epoch = 10
X = Images(:, :, 8001:10000);  % test images
D = Labels(8001:10000);        % test labels
acc = 0; N = length(D);
for k = 1:N
   x  = X(:, :, k);              % input, 28x28
   y1 = Conv(x, W1);             % convolution, 20x20x20
   y2 = ReLU(y1);                % ReLU
   y3 = Pool(y2);                % pooling, 10x10x20
   y4 = reshape(y3, [], 1);      % flatten, 2000x1
   v5 = W5*y4;  y5 = ReLU(v5);   % hidden layer with ReLU, 100x1
   v  = Wo*y5;  y  = Softmax(v); % output layer with softmax, 10x1
   [~, i] = max(y);
   if i == D(k), acc = acc + 1; end
end
acc = acc / N;
fprintf('Accuracy is %f\n', acc);
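The final classification step of the test loop, softmax over the 10 output scores followed by an argmax comparison with the label, can be sketched in Python (the score vectors below are made-up examples):

```python
import math

# Numerically stable softmax: subtract the max before exponentiating.
def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

# Accuracy: fraction of samples whose argmax class matches the label.
def accuracy(scores, labels):
    correct = sum(1 for v, d in zip(scores, labels)
                  if max(range(len(v)), key=lambda i: v[i]) == d)
    return correct / len(labels)

# Two hypothetical 3-class score vectors with true labels 2 and 0:
print(accuracy([[0.1, 0.2, 3.0], [2.0, 0.1, 0.3]], [2, 0]))  # 1.0
```

Since softmax is monotone, the argmax of the raw scores equals the argmax of the softmax outputs; softmax matters for training, not for the accuracy count itself.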
plot features
x  = X(:, :, 2);           % one example image
y1 = Conv(x, W1);          % convolution, 20x20x20
y2 = ReLU(y1);             % ReLU
y3 = Pool(y2);             % pooling, 10x10x20
y4 = reshape(y3, [], 1);   % flatten, 2000x1
convFilters = zeros(9*9, 20);
for i = 1:20, filter = W1(:, :, i); convFilters(:, i) = filter(:); end
display_network(convFilters);
title('Convolution Filters')
fList = zeros(20*20, 20);
for i = 1:20, feature = y1(:, :, i); fList(:, i) = feature(:); end
display_network(fList);
title('Features [Convolution]')

fList = zeros(20*20, 20);
for i = 1:20, feature = y2(:, :, i); fList(:, i) = feature(:); end
display_network(fList);
title('Features [Convolution + ReLU]')

fList = zeros(10*10, 20);
for i = 1:20, feature = y3(:, :, i); fList(:, i) = feature(:); end
display_network(fList);
title('Features [Convolution + ReLU + MeanPool]')